Abstract


The Year2000 problem arises in systems which use two digits to store the year in
the date field and do not explicitly store the century information. The century
is implicit and assumed to be ‘19’. These compact date formats work well from
year 1900 to year 1999, because most programs work under the same assumption.
However, these systems will fail to operate correctly in year 2000 and beyond.

The solutions proposed for this problem require changes in the declarations of
the variables that store date values, and sometimes in the parts of the code that
handle such variables. It is widely accepted that a completely automated solution
to the problem is not possible. At the same time, programmers alone cannot make
all the required changes. Therefore any tool that helps even in partial
automation is a welcome step.

In this thesis a technique called impact analysis is proposed to identify all
the date-related variables. The technique is based on the reaching definition
analysis and du-chain construction techniques commonly used in the optimization
phase of compilers. A tool which implements the technique has also been developed.


Chapter 1
Introduction 


1.1 The Year2000 Problem 

The Year2000 problem is an interesting phenomenon from both the technical and the
sociological points of view. At the technical level, it concerns the widespread
practice of storing the year of a date in two-digits format. For example, year
1998 is represented as ‘98’ only. As a result, at the end of the 20th century many
software applications will stop working or produce erroneous results. At the
sociological level, it is interesting to see how individuals and organizations
react to the crisis. However, we will concern ourselves only with the technical
aspect of the problem.

The problem behind the Year2000 challenge is very easy to grasp, yet its
consequences are grave. Most programs and databases use two digits for the year
and do not store the century information. For example, the Gregorian date format
CCYY/MM/DD is represented in YY/MM/DD format and the Julian date format
DDD/CCYY is represented in DDD/YY format. The century is assumed to be ‘19’
and is not explicitly mentioned. The problem is mainly in legacy systems, where
memory was considered very precious and programmers made this assumption in
order to save storage space. In some systems the assumption was made just to save
the data entry persons a few key-strokes. This compact date format works well
from year 1900 to year 1999 because most programs operate on the same assumption,
but it will fail for year 2000 and beyond. This problem is also referred to as
the Y2K problem, the Century Date Conversion problem (CDC problem), the
Millennium problem, etc.


1.2 The Year2000 Problem Exposure 

Software systems in which the year is represented by just two digits are
exposed to the Year2000 problem. Hardware that does not support a four-digits year
field format will also be affected. For example, PC-BIOS stores the year field in
CMOS-ROM in two-digits format. There are bar-code systems which use a two-digits
year format as part of their schemas. The Year2000 problem will also affect those
systems which read and use these bar-codes. In our world, where systems are highly
interconnected through networks, it has been predicted that even systems which do
not have the Year2000 problem in their own environment will be exposed to it
through data transferred from affected systems on the network [8, 15].

The Year2000 problem is further compounded by the numerous variations of date
representation and the mathematical calculations done on these dates. The following
section gives a few classifications of the Year2000 problem exposure.

1.2.1 The Year2000 Exposure Classification 

• Incorrect century 

Systems which use a two-digits year field assume the century field to be
‘19’. These systems ignore the century field during date entry or update. Hard
copy output of these systems puts ‘19’ in the century field by default.

• Incorrect field format 

Date formats such as YY/MM/DD (Gregorian) or DDD/YY (Julian) bind the
system to operate within a fixed 100-years window ranging from year 1900 to
1999.

• Arithmetic Calculation 

Arithmetic with the two-digits year representation cannot work outside the
year range 1900-1999. It will produce anomalous results in year 2000 and
beyond. For example, the difference between year 2010 (represented as ‘10’) and
year 1990 (represented as ‘90’) is calculated as (10 - 90) = -80 years, whereas
the correct result is (2010 - 1990) = 20 years.

• Leap year calculation 

Year 2000 is a leap year and is represented as ‘00’. Since the default century 
value is ‘19’, computers will see year 2000 as year 1900, which is not a leap year. 
Potential exposures caused by the identification of year 2000 as non-leap-year 
are [6]: 

- Day-in-year calculation. The year 2000 has 366 days and not 365 days. 

- Day-of-the-week calculation. February 28, 2000 is a Monday and March 1,
2000 is a Wednesday. Systems will fail to calculate the day-of-the-week
correctly. For example, the Ada language related database exhibits a strange
Year2000 problem [19]. It declares year 2000 a leap year and seems to
calculate the day-of-the-week of February 29, 2000 correctly, but fails to
calculate that of March 01, 2000 correctly.

• Data Integrity 

The Year2000 problem affected systems will see year 2000 as year 1900 and 
will fail to distinguish between events occurring in the year 1900 and 2000. 
Events occurring in the year 2010 will appear to have already occurred in the 
year 1910. Time will appear to have reversed. 

• Sequence 

Sorting with the date as the key value will produce erroneous results. Records of
year 2000 (represented as ‘00’) will appear before those of year 1999 (represented
as ‘99’) in the sorted sequence.

• Year value with special meaning 

In some systems certain year values are assigned a special meaning and the
meaning is hard-coded in the program. For example, systems may treat ‘99’ in the
year field as ‘date value not available’, or ‘00’ in the year field may signify
‘this record has expired’, etc.

• The Century Rollover

The century rollover means the transition of a system from one century to the 
next. For example, century rollover for 21st century means transition of the 
system-date from 31st December, 1999 to 1st January, 2000. Systems which 
are affected by Year2000 problem very often fail in 21st century rollover and 
they are said to be showing century rollover syndrome. 
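
Several of the exposure classes listed above can be reproduced in a few lines. The following Python sketch is illustrative only: the helper names `assumed_year` and `two_digit_difference` are hypothetical and merely imitate the implicit ‘19’ century logic of an affected system.

```python
import calendar
from datetime import date


def assumed_year(yy: int) -> int:
    """Interpret a two-digit year the way an affected system does: century '19'."""
    return 1900 + yy


def two_digit_difference(later_yy: int, earlier_yy: int) -> int:
    """Subtract two-digit years directly, ignoring the century."""
    return later_yy - earlier_yy


# Arithmetic calculation: 2010 - 1990 should be 20 years.
print(two_digit_difference(10, 90))           # -80
# Leap year calculation: '00' is read as 1900, which is not a leap year.
print(calendar.isleap(assumed_year(0)))       # False
print(calendar.isleap(2000))                  # True
# Day-of-the-week calculation: weekday() returns 0 for Monday.
print(date(2000, 3, 1).weekday())             # 2 (Wednesday)
print(date(assumed_year(0), 3, 1).weekday())  # 3 (Thursday, i.e. March 1, 1900)
# Sequence: '00' sorts before '99' although 2000 follows 1999.
print(sorted(["99", "00"]))                   # ['00', '99']
```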

Leon A. Kappelman has coined the term DRAGON to represent the three possible
outcomes of the Year2000 problem [14]. DRAGON stands for Date Related Abend,
Garbage Or Nothing. The term ‘Date Related’ is used because of the nature of the
problem. The problem has three possible consequences for a system:

i) Abend, which in computing parlance means abnormal termination, i.e. total
shutdown of the system. This is the most extreme impact on the system.

ii) Garbage, i.e. the system will produce erroneous outputs.

iii) Nothing, i.e. in some systems the problem will have no effect.

1.3 Solutions and Techniques 

All the solutions proposed for the Year2000 problem require changes in the
programs and/or databases. The process of incorporating the required changes must
be systematic and is commonly referred to as the Year2000 conversion process. The
proposed solutions can be classified into three categories, namely the Data
approach, the Procedural approach and the Encoding/Compression approach. All
three approaches, along with their advantages and disadvantages, are discussed
below [1, 8, 9, 10].

1.3.1 Data approach 

The data approach involves expansion of the date fields from two-digits year format
(YY) to four-digits year format (CCYY) to include century information, both in
source code and in stored data. The solution requires changes in all the data
files and databases and also in the application programs that refer to or use the
changed databases.

Advantages 

• This is the ideal solution as the converted application will survive up to the
year 9999.

• This approach requires less code upgrade effort and results in simpler
date logic in programs, i.e. conversion effort is minimized as most changes will
be confined to the data declaration part.

• This conversion approach is both consistent and accurate. It eliminates two- 
digits year ambiguity and Year2000 rollover issue simultaneously. Moreover, 
this approach ensures consistency with newly developed century-compliant 
systems. 

Disadvantages 

• This method requires conversion of virtually all programs of the system. 

• Since the conversion of the programs and the data must occur simultaneously 
this approach requires very careful project management. The process may 
force shutdown of the system until the whole conversion process is over. 

• Data conversion process in this approach may be costly. Archived data may 
also need conversion or support by special logic. The process may become 
complex, potentially needing support for the new and old date formats simul- 
taneously. 

1.3.2 Procedural approach 

The procedural approach is based on the observation that a two-digits year field
naturally offers a 100-years window. Since these systems assume the century to be
‘19’, the two-digits year field gives them a fixed window ranging from year 1900 to
1999, in which they work properly. The base year in this case is 1900. This
approach involves changes only in the source code, incorporating logic such
that any value can be imposed as the base year and the century of the two-digits
year is correctly interpreted. This solution does not require expansion of the year
field either in the data or in the declaration part of the code.

There are two variations of this scheme. 

• Fixed Window Technique 

• Sliding Window Technique 

■ Fixed Window Technique

The fixed window technique uses a static 100-years interval that generally spans 
over century boundary. For example the window can be 1964-2063, where the base 
year is 1964. The application system can be modified to infer the century of the 
two-digits year based on the following logic: 

if year >= 64 then century of the year is 19. 
if year <= 63 then century of the year is 20. 
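
The rule above can be written as a small function. This is a minimal Python sketch, assuming the 1964-2063 window of the example; the function name `fixed_window_year` and the `pivot` parameter are hypothetical names chosen for illustration.

```python
def fixed_window_year(yy: int, pivot: int = 64) -> int:
    """Expand a two-digit year using a fixed 100-years window.

    With pivot=64 the window is 1964-2063: values >= 64 belong to the
    19xx century, values <= 63 to the 20xx century.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = 19 if yy >= pivot else 20
    return century * 100 + yy


print(fixed_window_year(64))  # 1964
print(fixed_window_year(63))  # 2063
print(fixed_window_year(98))  # 1998
```

The window never moves, so the pivot has to be chosen once for the whole lifetime of the application.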

■ Sliding Window Technique

In this case the 100-years window is dynamic. The technique uses a self-advancing
100-years interval that generally crosses the century boundary. The user specifies
the number of years in the past and the number of years in the future relative to
the system date (generally the current year). The system maintains the 100-years
window for the data. The main advantage of the approach is that the window
advances automatically without any programming change.

This technique is suitable for applications which process data of a predefined 
interval. The sliding window technique can follow the following logic to infer century 
of the year. The example assumes past year number is 60 and the future year number 
is 40. 


In 1996, if year >= 36 then century of the year is 19.
         if year <= 35 then century of the year is 20.

In 1997, if year >= 37 then century of the year is 19.
         if year <= 36 then century of the year is 20.
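
The same inference can be sketched in Python. The function below is an illustration under the assumptions of the example (60 past years, so the remaining 40 years of the 100-years window lie in the future); the name `sliding_window_year` and its parameters are hypothetical.

```python
def sliding_window_year(yy: int, current_year: int, past_years: int = 60) -> int:
    """Expand a two-digit year using a 100-years window that starts
    `past_years` before `current_year` and advances with it."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    window_start = current_year - past_years
    # Place yy in the century of the window start, then move it forward
    # by one century if it falls before the start of the window.
    candidate = (window_start // 100) * 100 + yy
    if candidate < window_start:
        candidate += 100
    return candidate


print(sliding_window_year(36, current_year=1996))  # 1936
print(sliding_window_year(35, current_year=1996))  # 2035
print(sliding_window_year(36, current_year=1997))  # 2036
```

Note how the value ‘36’ changes century between 1996 and 1997 purely because the system date advanced.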

Advantages 

• No data conversion required in this technique. Existing data and archive data 
remain unexpanded. 

• Programs can be changed and tested in isolation from other interdependent 
systems. 

• The exposure to change is limited to the parts of the program where the logic 
is modified. 

Disadvantages 

• Some programs may require extensive code changes, because they may require 
extra logic to cater for century rollover and also because the century inference 
process can be complex. 

• The solution fails in systems whose dates span 100 years or more.

• The solution may require data-bridge in case of date-exchange with systems 
which use four-digits year field. 

• Sorting complexity may increase in this approach. 

1.3.3 Encoding/ Compression approach 

The previous two solutions neglect the fundamental cause of the Year2000 problem,
i.e. storage overflow. Computers can store much more information in the current
date formats (YYMMDD or YYDDD) than they are allowed to. The Encoding/Compression
method uses the existing storage space for the date to hold a more concise date
representation. The main advantage of the scheme is that in doing so we get a
window much larger than 100 years. There are many encoding/compression techniques
available [1, 18]. Some examples are given below:

• The CYYDDD Format : The existing YYMMDD format is converted to CYYDDD
format. This scheme can give a year-window 1000 years long.

• The DDDDDD Format : In this case the YYMMDD format is replaced by the DDDDDD
format. This representation can store 1 million days. The stored value is the
number of days elapsed since some base date, say January 1, 1900, so it can
track dates past year 4600.

• Conversion of the number representation scheme from decimal to hexadecimal.
Two hexadecimal digits can provide a year-window of 255 years.

• Conversion of the year data type from the two-bytes character representation 
of the two-digits year to a one-byte unsigned packed decimal representation; 
and using the free byte to store two-digits century field as unsigned packed 
decimal. 

The solution requires changes in both the data and the programs. It also requires 
simultaneous conversion of data and all the applications which refer to or use the 
converted data. 
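
As an illustration of the DDDDDD format described above, the following Python sketch stores a date as a six-digit day count. The base date of January 1, 1900 is the one suggested in the text; the function names `encode_dddddd` and `decode_dddddd` are hypothetical.

```python
from datetime import date, timedelta

BASE_DATE = date(1900, 1, 1)  # assumed base date, as suggested in the text


def encode_dddddd(d: date) -> str:
    """Encode a date as the number of days elapsed since BASE_DATE (six digits)."""
    days = (d - BASE_DATE).days
    if not 0 <= days <= 999_999:
        raise ValueError("date outside the six-digit day-count window")
    return f"{days:06d}"


def decode_dddddd(field: str) -> date:
    """Decode a six-digit day count back into a calendar date."""
    return BASE_DATE + timedelta(days=int(field))


print(encode_dddddd(date(2000, 1, 1)))       # 036524
print(decode_dddddd("036524"))               # 2000-01-01
print(decode_dddddd("999999").year >= 4600)  # True: the window extends past year 4600
```

The field still occupies six digits, but every program that reads it needs the extra decoding step, which is the processing overhead noted among the disadvantages.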

Advantages 

• This approach does not require expansion of two-digits year format to four- 
digits year format, yet it gives the system a larger year-window to operate in 
comparison to the ‘procedural approach’. 

Disadvantages 

• The approach requires simultaneous changes in all the program units and data. 

• Although the scheme does not require expansion of date fields, it requires
updating of the data. At the same time it requires an extra level of processing
at all the places in the code where the encoded year field is used.


1.4 Problem Definition

All the approaches to the solution of the Year2000 problem require identification
and modification of date-variables as well as those parts of the program which
handle those date-variables. A modification in any part of the program has a ripple
effect throughout the program, i.e. such changes affect other parts of the program,
including other variables. For example, consider the following piece of code:


    01  JOIN-DATE   DATE   pic 9(6)
    01  A           INT    pic 9(6)
    01  B           INT    pic 9(6)

    MOVE JOIN-DATE TO A.
    MOVE A TO B.

Now assume that the date-variable JOIN-DATE has been expanded to eight digits.
Since the variable A can hold only six digits of information, two digits of
information will be lost at the first MOVE statement. This implies that in order to
retain the complete information, variable A should also be expanded to eight digits.
In this case variable A is under the direct impact of the date-variable JOIN-DATE.
A similar argument applies to variable B: B is under the direct impact of the
variable A and under the indirect impact of the variable JOIN-DATE.
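
The direct and indirect impact described above can be sketched as a simple propagation over the MOVE statements. This Python fragment is only an illustration and deliberately ignores control flow; the function name `impacted_variables` and the `(source, destination)` encoding of a MOVE are assumptions made for the example, not part of the tool described in this thesis.

```python
from collections import deque
from typing import Dict, List, Tuple


def impacted_variables(seed: str, moves: List[Tuple[str, str]]) -> Dict[str, int]:
    """Return every variable reachable from `seed` through MOVE statements.

    The value is the distance from the seed: 1 means direct impact,
    anything larger means indirect impact.
    """
    level = {seed: 0}
    queue = deque([seed])
    while queue:
        var = queue.popleft()
        for source, destination in moves:
            if source == var and destination not in level:
                level[destination] = level[var] + 1
                queue.append(destination)
    del level[seed]
    return level


# MOVE JOIN-DATE TO A.  MOVE A TO B.
MOVES = [("JOIN-DATE", "A"), ("A", "B")]
print(impacted_variables("JOIN-DATE", MOVES))  # {'A': 1, 'B': 2}
```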

From the above example it is clear that changes are required both for
date-variables and for non-date-variables which interact directly or indirectly
with the date-variables. Therefore we need to identify all the variables that are
likely to obtain a year value and also those parts of the code which are likely to
manipulate such values. The process of identifying such variables is called
Impact Analysis.

As argued by Peter de Jager in his article Biting the Silver Bullet [7], no fully
automated solution to the problem, which he refers to as a silver bullet, is
possible. He has identified more than a dozen challenges a silver bullet has to
overcome, and it is unlikely that a single tool can overcome all of them. At the
same time it is also apparent that carrying out the conversion process by human
effort alone is not practical. Therefore tools that help in the Year2000 conversion
process are required.

1.5 Existing Tools 

The Year2000 problem has captured world-wide attention, accompanied by
bizarre predictions. Many existing companies started addressing the problem and
many service companies were established to deal with it exclusively.
As a result numerous tools have come up. They address various aspects of the
Year2000 conversion process. Nicholas Zvegintzov, in his article A
Resource Guide to Year 2000 Problem [22], has classified the specialized Y2K tools
along six perspectives:

1. Inventory Analysis. Tools that support the task of identifying the executable 
software inventory. 

2. Recovering source from object. Tools for analyzing code for which the source 
cannot be reliably identified. 

3. Project estimation. Tools for estimating the work load of Y2K conversion 
process. 

4. Analysis and conversion. Tools for finding and changing code or data struc- 
tures for Y2K conversion. 

5. Time simulators. Testing tools which simulate and modify clock time to pro- 
vide testing environment. 

6. Date libraries. Libraries for standard date formats and date calculations.

In this thesis we are concerned only with the tools for analysis and conversion.
Analysis at the code level is very important for the Year2000 conversion process.
The conversion process, as discussed earlier, requires identification and
modification of date-variables. Analysis requires identification of the system
components and of how they interact with data and among themselves. This makes
code modification a secondary job. There are many tool sets specializing in date
and time analysis, but only a few tools which support direct conversion/change
of code. Here are a few examples of tools of this kind [1, 3, 4, 21, 22].

• Analyzer 2000 is a simple inventory tool that scans files for date patterns, flags
them and makes them editable. The tool works under the MVS and OS/2 environments
with Cobol, JCL and RPG as input languages. The tool is from Ironsoft
Inc., Madison, WI.

• CA-Impact/2000 is a tool with the capability to trace, inventory and edit
potential date-related fields. Limited impact analysis is done within one program
unit. The tool is marketed by Computer Associates International, NY.

• Cayenne 2000 is an analysis tool from Cayenne Software Inc. The tool finds 
potential date fields and performs limited impact analysis. 

• Century File Conversion locates and inventories date-related fields in source 
code. The tool is marketed by Quintic Systems. 


1.6 Scope of the Thesis 

As discussed earlier, the emphasis of the Year2000 conversion is on the analysis of
the programs. The objective of program analysis is the identification of date
and non-date variables which may hold date-values during program execution. At
the same time the analysis reveals the places in the programs where such
variables are used. These are the potential places to carry out changes for the
Year2000 conversion. We have proposed a technique called Impact Analysis for
identifying the variables which may hold date-values during program execution and
also the places in the program where those variables are used.
Impact analysis of a given variable traces the path of the data that can be
associated with that variable. If the given variable is a date-variable, the
technique identifies the path of a date-value originating from the specified
date-variable. An editing tool based on the impact analysis technique has also been
implemented. Brief descriptions of the following chapters are given below:

• Chapter 2 explains the impact analysis technique along with its theoretical 
background. It includes discussions on control-flow graph, reaching definition 
analysis and du-chain which lead to impact analysis. 

• Chapter 3 discusses the functionalities and high level design of the Power
Editor. It also discusses the individual modules of the tool.

• Chapter 4 includes results, along with discussion of the environment in which 
the tool runs, code size and testing results. 

• Chapter 5 draws the conclusion. 

A user’s manual for the tool is appended as appendix A at the end of this thesis.
A few test results of the tool are included as appendix B.





Chapter 2 

Impact Analysis Technique 

2.1 Introduction 

The impact analysis technique forms the backbone of our tool. The technique is
based on data-flow analysis, which is a common method used for code
optimization in the compiler domain [2, 11]. More specifically, we use the
reaching definition analysis and du-chain construction methods as the basis of the
technique. The discussion of these techniques assumes that the input program has
been reduced to its equivalent intermediate representation. From this point onward
we shall assume that the techniques are applied to input programs in 3-address
intermediate code, and we shall refer to each 3-address code as a statement.

2.2 Basic Block 

A basic block is a sequence of consecutive statements in which flow-of-control enters 
at the beginning and leaves at the end, and no other flow-of-control branches in or 
out from the basic block. 

The algorithm for identification of basic blocks in a program is given below:

Algorithm : Partition of basic blocks.

Input : A sequence of 3-address statements. 

Output : A list of basic blocks with each 3-address statement in exactly one 
block. 

Method : 

1. We first determine the set of leaders, the first statements of basic blocks. 
The rules are as follows; 

i) The first statement is a leader. 

ii) Any statement that is the target of a conditional or an unconditional 
‘goto’ statement is a leader. 

iii) Any statement that immediately follows a ‘goto’ or a conditional 
‘goto’ statement is a leader. 

2. For each leader, its basic block consists of the leader and all statements
up to but not including the next leader or the end of the program.

Figure 1 shows the basic blocks identified using the above algorithm. 


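Input 3-address code:

        i=0;
    L1:
        t1=(i==10);
        JZR L2,t1;
        t2=a+b;
        t3=t2+i;
        a=t3;
        i=i+1;
        JMP L1;
    L2:
        c=a/20;

Basic blocks:

    +-------------+
    | i=0;        |
    +-------------+
    | t1=(i==10); |
    | JZR L2,t1;  |
    +-------------+
    | t2=a+b;     |
    | t3=t2+i;    |
    | a=t3;       |
    | i=i+1;      |
    | JMP L1;     |
    +-------------+
    | c=a/20;     |
    +-------------+
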

Figure 1: Basic Blocks. 
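
The partitioning algorithm can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the tool: the `Stmt` record, with an optional `label` and an optional jump `target`, is a hypothetical encoding of a 3-address statement chosen for the example.

```python
from typing import List, NamedTuple, Optional


class Stmt(NamedTuple):
    text: str                     # the 3-address statement
    label: Optional[str] = None   # label attached to this statement, if any
    target: Optional[str] = None  # label jumped to, if the statement is a jump


def find_leaders(code: List[Stmt]) -> List[int]:
    """Return the indices of the leaders, following rules (i) to (iii)."""
    label_index = {s.label: i for i, s in enumerate(code) if s.label}
    leaders = {0} if code else set()                 # rule (i)
    for i, stmt in enumerate(code):
        if stmt.target is not None:
            leaders.add(label_index[stmt.target])    # rule (ii)
            if i + 1 < len(code):
                leaders.add(i + 1)                   # rule (iii)
    return sorted(leaders)


def partition(code: List[Stmt]) -> List[List[Stmt]]:
    """Split the code into basic blocks, one block per leader."""
    bounds = find_leaders(code) + [len(code)]
    return [code[start:end] for start, end in zip(bounds, bounds[1:])]


PROGRAM = [
    Stmt("i=0"),
    Stmt("t1=(i==10)", label="L1"),
    Stmt("JZR L2,t1", target="L2"),
    Stmt("t2=a+b"),
    Stmt("t3=t2+i"),
    Stmt("a=t3"),
    Stmt("i=i+1"),
    Stmt("JMP L1", target="L1"),
    Stmt("c=a/20", label="L2"),
]

for block in partition(PROGRAM):
    print([s.text for s in block])
```

For the program of Figure 1 the sketch yields four blocks containing 1, 2, 5 and 1 statements respectively.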
2.3 Control-Flow Graph

Flow-of-control in a program means the possible execution sequences of the program
at run time. We can add the flow-of-control information to the set of basic blocks
to get a Control-Flow Graph. The basic blocks are the nodes of the graph and the
flow-of-control information forms its directed edges.

One node of the control-flow graph is distinguished as the initial node, and
program execution starts from this ‘initial’ node. If there is a directed edge from
the basic block B1 to B2, then the execution of B2 can follow that of B1
in some execution sequence. This can only happen when either of the following holds:

1. There is a conditional or an unconditional jump from the last statement
of B1 to the first statement of B2.

2. B2 immediately follows B1 in the order of the program and there is no
unconditional jump from the last statement of B1.


Then B1 is said to be the predecessor of B2 and B2 is the successor of B1.
Figure 2 shows the control-flow graph of a simple loop statement.


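        i=0;
    L1:
        t1=(i==10);
        JZR L2,t1;
        t2=a+b;
        t3=t2+i;
        a=t3;
        i=i+1;
        JMP L1;
    L2:
        c=a/20;


              +-------------+
              | i=0;        |
              +-------------+
                     |
                     v
              +-------------+
        +---->| t1=(i==10); |-----+
        |     | JZR L2,t1;  |     |
        |     +-------------+     |
        |            |            |
        |            v            |
        |     +-------------+     |
        |     | t2=a+b;     |     |
        |     | t3=t2+i;    |     |
        |     | a=t3;       |     |
        |     | i=i+1;      |     |
        +-----| JMP L1;     |     |
              +-------------+     |
                                  |
              +-------------+     |
              | c=a/20;     |<----+
              +-------------+
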

Figure 2 : Control-Flow Graph. 
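
The two edge conditions can be turned into code directly. The Python sketch below assumes the basic blocks are already known; the `Block` record and its fields `label`, `jump_to` and `unconditional` are hypothetical names introduced only for this illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Block(NamedTuple):
    stmts: List[str]
    label: Optional[str] = None    # label of the block's first statement
    jump_to: Optional[str] = None  # target of the last statement, if it is a jump
    unconditional: bool = False    # True if that jump is unconditional


def build_cfg(blocks: List[Block]) -> Dict[int, List[int]]:
    """Return the successors of each block, indexed by block position."""
    by_label = {b.label: i for i, b in enumerate(blocks) if b.label}
    succ: Dict[int, List[int]] = {i: [] for i in range(len(blocks))}
    for i, block in enumerate(blocks):
        if block.jump_to is not None:                  # condition 1: jump edge
            succ[i].append(by_label[block.jump_to])
        if not block.unconditional and i + 1 < len(blocks):
            if i + 1 not in succ[i]:                   # condition 2: fall-through
                succ[i].append(i + 1)
    return succ


def predecessors(succ: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """Invert the successor relation to obtain PRED for every block."""
    pred: Dict[int, List[int]] = {i: [] for i in succ}
    for i, targets in succ.items():
        for t in targets:
            pred[t].append(i)
    return pred


BLOCKS = [
    Block(["i=0"]),
    Block(["t1=(i==10)", "JZR L2,t1"], label="L1", jump_to="L2"),
    Block(["t2=a+b", "t3=t2+i", "a=t3", "i=i+1", "JMP L1"],
          jump_to="L1", unconditional=True),
    Block(["c=a/20"], label="L2"),
]

SUCC = build_cfg(BLOCKS)
print(SUCC)                # {0: [1], 1: [3, 2], 2: [1], 3: []}
print(predecessors(SUCC))  # {0: [], 1: [0, 2], 2: [1], 3: [1]}
```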
2.4 Data-Flow Analysis

Data-flow analysis is the process of ascertaining and collecting information about
the possible run-time modifications and usage of certain quantities in the program.
Since data, or the values of variables, are the fundamental information that the
program manipulates, the values of variables are usually taken as the quantities
under scrutiny in data-flow analysis.

We distinguish among several levels of data-flow analysis: 

(a) statement 

(b) basic block (intrablock) 

(c) procedure (intraprocedural) and 

(d) program (interprocedural) 

Some authors describe statement level and basic block level analysis as local 
data-flow analysis and intraprocedural and interprocedural analysis as global data- 
flow analysis. 

Data-flow information is collected hierarchically. Statement level analysis
involves the analysis of a single statement. Here we calculate which variables have
their values modified, preserved and used if control proceeds through the
statement. Statement level analysis is used in basic block level analysis: we
examine each of the statements in a basic block in order and calculate similar
information for the block. Basic block level information, along with the
control-flow graph, leads to information about a procedure. And with similar
information about procedures, we can construct information about the whole program
in program level analysis. However, collection of intraprocedural information
through control-flow is somewhat complicated. Even information about a particular
individual statement, such as a ‘call’ statement, may not be immediately available.

2.5 Global Data-Flow Analysis 

Reaching definition analysis and live variable analysis are the two typical
applications of the information generated through global data-flow analysis. We
shall discuss only reaching definition analysis, since impact analysis is easily
obtained from it.

Here we recall a few definitions which will be used in further discussion.


2.5.1 Definitions 

1. Point: A point in a basic block is the position between two adjacent
statements, as well as the position before the first statement of the block and
the position after the last statement of the block.

If there are $n$ statements $s_1, s_2, \ldots, s_n$ in a basic block, there are
$n + 1$ points $p_1, p_2, \ldots, p_{n+1}$ in the block, such that

$p_{i+1}$ is in between $s_i$ and $s_{i+1}$, $\forall i : 1 \le i \le (n - 1)$,
and $p_1$ is before $s_1$ and $p_{n+1}$ is after $s_n$.

Figure 3 shows the points in a basic block. 


    p1 ->
            i=m+1;
    p2 ->
            j=n;
    p3 ->
            a=u1;
    p4 ->


Figure 3: Points in a Basic Block. 


2. Path in the Control-Flow Graph: A path from point $p_1$ to $p_{n+1}$ is a
sequence of points $p_1, p_2, \ldots, p_{n+1}$ such that for each $i$,
$1 \le i \le n$, either of the following two conditions holds:

• $p_i$ is the point immediately preceding a statement and $p_{i+1}$ is the point
immediately following that statement in the same block.

• $p_i$ is the last point of some block and $p_{i+1}$ is the first point of a
successor block.

Figure 4 shows a path in a control-flow graph from point $p_1$ to $p_{12}$. The
points it passes through are:
[$p_1, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_{10}, p_3, p_4, p_5, p_{11}, p_{12}$]



Figure 4 : Path in a Control-Flow Graph. 


3. Definition of a variable: A definition of a variable x is a statement that
assigns a value to x.

4. Use of a variable: A use of a variable x is a statement where x occurs on the
right hand side as an operand.

Consider the following piece of code in 3-address representation.

         1. i = 0;
         2. k = 10;
         3. x = 5.0;
         4. b = 2.25;
    L1:
         5. t1 = k - i;
         6. JNZ L2,t1;
         7. t2 = x + b;
         8. t3 = 50.0 - x;
         9. x = t3;
        10. t4 = i + 1;
        11. i = t4;
        12. JMP L1;
    L2:



Here the definitions of the variable i are at statement nos 1 and 11 and uses 
of i are at statement nos 5 and 10. Similarly, definitions of variable x are at 
statements 3 and 9 and its uses are at statement nos 7 and 8. 
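
The definitions and uses listed above can be collected mechanically. In the following Python sketch each numbered statement is encoded by hand as a pair (defined variable or None, operand variables); this encoding and the function name `definitions_and_uses` are assumptions made for the illustration.

```python
from collections import defaultdict

# Statement number -> (variable defined or None, variables used as operands)
CODE = {
    1: ("i", []),
    2: ("k", []),
    3: ("x", []),
    4: ("b", []),
    5: ("t1", ["k", "i"]),
    6: (None, ["t1"]),        # JNZ L2,t1 only uses t1
    7: ("t2", ["x", "b"]),
    8: ("t3", ["x"]),
    9: ("x", ["t3"]),
    10: ("t4", ["i"]),
    11: ("i", ["t4"]),
    12: (None, []),           # JMP L1 neither defines nor uses a variable
}


def definitions_and_uses(code):
    """Map every variable to the statements defining it and those using it."""
    defs, uses = defaultdict(list), defaultdict(list)
    for number in sorted(code):
        target, operands = code[number]
        if target is not None:
            defs[target].append(number)
        for var in dict.fromkeys(operands):  # drop repeated operands, keep order
            uses[var].append(number)
    return dict(defs), dict(uses)


DEFS, USES = definitions_and_uses(CODE)
print(DEFS["i"], USES["i"])  # [1, 11] [5, 10]
print(DEFS["x"], USES["x"])  # [3, 9] [7, 8]
```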

2.5.2 Global Data-Flow Equations 

Information about the data or values of variables ascertained from data-flow
analysis is represented as sets. Typically, there are four kinds of sets
associated with every node of a control-flow graph.

• in[B] is the set of definitions that enter the basic block B.

• out[B] is the set of definitions that come out of the basic block B.

• gen[B] is the set of definitions generated within the basic block B. These are
the locally exposed or locally generated definitions for node B, i.e. the
set of definitions created by B as far as the outside world is concerned.

• Killing of a definition: A definition d of a variable x is said to be ‘killed’ at
point p along a path from the point immediately following d to the point p,
if along that path there is another definition of x. The definition d is live at p
if d is not killed along some path from the point immediately following d to p.

• kill[B] is the set of definitions killed in the basic block B.
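
As an illustration, gen[B] and kill[B] can be derived from the definitions contained in each block. The Python sketch below assumes each block is given as a list of (definition number, variable) pairs; the block names B1 to B3 and the function `gen_kill` are hypothetical, and the data corresponds to the 12-statement example of Section 2.5.1.

```python
from collections import defaultdict


def gen_kill(blocks):
    """Compute gen[B] and kill[B] for blocks given as lists of
    (definition number, variable) pairs in program order."""
    all_defs = defaultdict(set)
    for defs in blocks.values():
        for number, var in defs:
            all_defs[var].add(number)

    gen, kill = {}, {}
    for name, defs in blocks.items():
        last = {}
        for number, var in defs:
            last[var] = number          # a later definition hides an earlier one
        gen[name] = set(last.values())  # definitions that survive to the block end
        # every other definition of a variable assigned in the block is killed
        kill[name] = set().union(*(all_defs[v] for v in last)) - gen[name]
    return gen, kill


BLOCKS = {
    "B1": [(1, "i"), (2, "k"), (3, "x"), (4, "b")],
    "B2": [(5, "t1")],
    "B3": [(7, "t2"), (8, "t3"), (9, "x"), (10, "t4"), (11, "i")],
}

GEN, KILL = gen_kill(BLOCKS)
print(sorted(GEN["B1"]), sorted(KILL["B1"]))  # [1, 2, 3, 4] [9, 11]
print(sorted(GEN["B3"]), sorted(KILL["B3"]))  # [7, 8, 9, 10, 11] [1, 3]
```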
2.6 Reaching Definition Analysis

We say a definition d reaches a point p if there is a path in the control-flow graph
from the point immediately following d to p such that d is not ‘killed’ along that
path.

2.6.1 Data-Flow Equations for Reaching Definition Analysis 

1. For each basic block B:

$$in[B] = \bigcup_{x \in PRED[B]} out[x]$$

If $PRED[B] = \emptyset$ then $in[B] = \emptyset$.

2. For each basic block B:

$$out[B] = gen[B] \cup (in[B] - kill[B])$$

3. Combining 1 and 2:

$$out[B] = gen[B] \cup \Bigl(\bigl(\bigcup_{x \in PRED[B]} out[x]\bigr) - kill[B]\Bigr)$$

If $PRED[B] = \emptyset$ then $in[B] = \emptyset$.

• In the above equations PRED[B] denotes the set of predecessor blocks
of basic block B.

Since equations 1 and 2 do not necessarily have a unique solution for out, we desire
the smallest solution.
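
One common way to obtain the smallest solution is to start from empty sets and apply equations 1 and 2 repeatedly until nothing changes. The Python sketch below follows that iterative scheme; it is an illustration rather than the algorithm of the tool, and the block names, PRED relation and gen/kill sets are hand-derived from the 12-statement example of Section 2.5.1.

```python
def reaching_definitions(blocks, pred, gen, kill):
    """Solve in[B] and out[B] by iterating from empty sets to a fixed point."""
    in_sets = {b: set() for b in blocks}
    out_sets = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # equation 1: union of out[x] over all predecessors x of B
            new_in = set().union(*(out_sets[p] for p in pred[b]))
            # equation 2: out[B] = gen[B] U (in[B] - kill[B])
            new_out = gen[b] | (new_in - kill[b])
            if new_in != in_sets[b] or new_out != out_sets[b]:
                in_sets[b], out_sets[b] = new_in, new_out
                changed = True
    return in_sets, out_sets


BLOCKS = ["B1", "B2", "B3"]                 # statements 1-4, 5-6 and 7-12
PRED = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"]}
GEN = {"B1": {1, 2, 3, 4}, "B2": {5}, "B3": {7, 8, 9, 10, 11}}
KILL = {"B1": {9, 11}, "B2": set(), "B3": {1, 3}}

IN, OUT = reaching_definitions(BLOCKS, PRED, GEN, KILL)
print(sorted(IN["B2"]))   # [1, 2, 3, 4, 5, 7, 8, 9, 10, 11]
print(sorted(OUT["B3"]))  # [2, 4, 5, 7, 8, 9, 10, 11]
```

Because the sets only ever grow from the empty set, the fixed point reached is the least one.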


2.7 du-Chain 

The construction of the du-chain, or definition-use chain, for a definition d of a
variable x involves identification of all the uses of x that d reaches. For a use
of the variable x in statement s, we determine whether the definition d reaches
s along any path. If so, then s is a part of the du-chain of d. The definition-use
chain of definition d has d as the head of the chain, followed by all the uses of
x that d reaches.
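
A du-chain can be sketched in Python by running reaching definition analysis at statement granularity and then keeping the uses that the definition reaches. The encoding below, (defined variable or None, operands, successor statements), is an assumption of this illustration and describes the 12-statement example of Section 2.5.1; the exit edge of statement 6 is omitted because no statement follows label L2.

```python
# Statement number -> (variable defined or None, operands, successor statements)
STMTS = {
    1: ("i", [], [2]),
    2: ("k", [], [3]),
    3: ("x", [], [4]),
    4: ("b", [], [5]),
    5: ("t1", ["k", "i"], [6]),
    6: (None, ["t1"], [7]),
    7: ("t2", ["x", "b"], [8]),
    8: ("t3", ["x"], [9]),
    9: ("x", ["t3"], [10]),
    10: ("t4", ["i"], [11]),
    11: ("i", ["t4"], [12]),
    12: (None, [], [5]),
}


def reaching_in(stmts):
    """Return, for each statement, the set of definitions reaching its entry."""
    preds = {n: [] for n in stmts}
    for n, (_, _, succs) in stmts.items():
        for s in succs:
            preds[s].append(n)
    in_sets = {n: set() for n in stmts}
    out_sets = {n: set() for n in stmts}
    changed = True
    while changed:
        changed = False
        for n, (target, _, _) in stmts.items():
            new_in = set().union(*(out_sets[p] for p in preds[n]))
            if target is None:
                new_out = set(new_in)
            else:  # this definition kills the other definitions of its variable
                new_out = {n} | {d for d in new_in if stmts[d][0] != target}
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets


def du_chain(d, stmts, in_sets):
    """Uses of the variable defined at d that are reached by d."""
    var = stmts[d][0]
    return [n for n in sorted(stmts) if var in stmts[n][1] and d in in_sets[n]]


IN_SETS = reaching_in(STMTS)
print(du_chain(1, STMTS, IN_SETS))  # [5, 10]
print(du_chain(9, STMTS, IN_SETS))  # [7, 8]
```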
2.8 Impact Analysis

A definition d of a variable x signifies a piece of data being associated with the
variable x. The du-chain of a definition d signifies the period for which the
data/value is associated with x. In the current application we want to trace the
complete life-period of a data value, which may be associated with more than one
variable. A data value can be replicated to other variables through the following:

(a) Assignment to other variables and

(b) Use as an argument of a ‘call’ to a function.

An assignment statement replicates the data and associates it with the variable to
which the assignment is made. Similarly, an actual argument of a ‘call’ statement
replicates its value to the corresponding formal (dummy) argument of the function
being called. The impact analysis technique tries to capture the complete flow of a
data/value throughout the program.

The impact analysis technique is essentially based on the reaching definition
analysis technique, and the information generated by it is stored as an impact
chain, which is an extension of the du-chain.

2.8.1 Definitions 

• Seed Variable is a variable for which impact analysis is to be done.

• Seed Definitions are the definitions involving the seed variable.

2.8.2 Impact Analysis and Impact Chain 

The impact analysis of a seed definition is recursive. At the
basic step we do the reaching definition analysis of the seed definition. As a result we
obtain the du-chain of the seed definition. Impact analysis is then done on the definitions
in that du-chain. The definition of the impact chain of a seed definition d is consequently
recursive and is given as follows:

1. Definition d is the head of the impact chain.

2. The ‘uses’ of d are part of the impact chain.

3. If d' is a definition in the chain which uses d, then the impact chain of d' is
also in the chain.


The impact chain of a definition may be a chain larger than the corresponding 
du-chain. For example, consider the following piece of code, where we want to find 
the impact chain of the variable a. 


01 :  int i,a,b;                   /* Declarations of i,a and b */
02 :  a=0;                         /* Definition of a */
03 :  b=0;                         /* Definition of b */
07 :  for(i=a;i<=20;i++) {         /* Use of a and definition and use of i */
13 :  }
14 :  b=a+10;                      /* Use of x */
15 :  printf("Enter a value:");
16 :  scanf("%d",&x);              /* Next definition of x */
17 :  if(a>b) {                    /* Use of a and b */

The du-chain and the impact chain of the definition of the variable a at line no.
2 are shown in Figure 5.

■ Impact Analysis Algorithm

The algorithm for impact analysis, based on the reaching definition analysis, is given
below:

Algorithm : Impact analysis, based on reaching definition analysis.
Input     : The control-flow graph of the program, the seed variable x.
Output    : Impact chains corresponding to the seed variable.
Method    :

Step 1:

    Find syntactically all the occurrences of the
    definition of the seed variable x. Call them
    seed definitions.

Step 2:

    For each ‘Seed Definition’ s do
    {
        R = { I | I is a definition and
                  the seed definition s reaches I
                  and I uses x };
        P = ∅;
        For each definition d in (R − P) do
        {
            v = variable defined in d;
            R' = { I | I is a definition and
                       d reaches I and I uses v };
            R = R ∪ R';
            P = P ∪ {d};
        }
        output P;
    }
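
Step 2 is a worklist computation and can be sketched in C as follows. This is an illustration under stated assumptions rather than the code of the tool: the du-chain information is assumed to be available already as a matrix reaches_use, where reaches_use[d][s] is non-zero when the definition at statement d reaches s and s uses the variable defined at d. Unlike the set P above, the sketch also records uses that are not definitions, so that its result is the full set of impact points. The name impact_chain is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_STMTS 32

/* Marks in_chain[s] = 1 for every statement in the impact chain of
   the definition at statement `seed`. Returns the number of
   statements in the chain, the seed included. */
int impact_chain(int n,
                 unsigned char reaches_use[][MAX_STMTS],
                 const unsigned char is_def[],
                 int seed,
                 unsigned char in_chain[])
{
    int work[MAX_STMTS];            /* definitions still to expand: R - P */
    unsigned char done[MAX_STMTS];  /* definitions already queued: P      */
    int top = 0, count = 1;

    assert(n > 0 && n <= MAX_STMTS && seed >= 0 && seed < n);
    memset(done, 0, sizeof done);
    memset(in_chain, 0, (size_t)n);
    in_chain[seed] = 1;
    done[seed] = 1;
    work[top++] = seed;

    while (top > 0) {
        int d = work[--top];

        for (int s = 0; s < n; s++) {
            if (!reaches_use[d][s])
                continue;
            if (!in_chain[s]) { in_chain[s] = 1; count++; }
            /* A use that is itself a definition propagates the value. */
            if (is_def[s] && !done[s]) { done[s] = 1; work[top++] = s; }
        }
    }
    return count;
}
```

Each definition is pushed on the worklist at most once, which mirrors the argument that |P| grows by one per iteration and is bounded by the number of definitions in the program.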

■ Proof of Convergence:

Proof: The proof of convergence of the impact analysis algorithm uses the convergence
of the reaching definition algorithm.

It is easy to see that step 1 of the algorithm converges. Step 1 parses the
program once and collects all the definitions of x where x syntactically matches the
seed variable. The number of definitions it collects is at most equal to the number of
definitions in the program, and since the number of definitions in a program is finite,
|R| is finite. Therefore, step 1 runs for a finite amount of time.

R is the set of seed definitions collected in step 1 and |R| is finite. P is the
set of definitions for which du-chain construction has been done. Initially P is
empty. In each iteration of the ‘for’ loop in step 2, the algorithm computes the
du-chain for a definition d, where d is an element of (R − P). (R − P) is the set of
definitions for which du-chain construction has not yet been done. Within the ‘for’
loop, the computation of R' uses the reaching definition analysis algorithm. Since the
reaching definition analysis algorithm takes a finite time, each iteration of the loop
also takes a finite amount of time. Observe that neither |R| nor |P| ever
decreases, and that |P| increases by 1 in each iteration. Since R is a set of definitions,
it can at most be the set of all definitions in the program, which is finite. Eventually,
|R − P| will be 0 and the iteration will terminate. Therefore, the number of iterations
of the loop is finite.

This proves that the algorithm converges. ■

■ A Few Definitions

• Impact Variable: Variable x is said to be an Impact Variable if a definition of
x is in the impact chain.

• Impact Points: All the definitions and the uses in an impact chain are called the
Impact Points.

Chapter 3


Design Approach of a 
Power-Editor based on Impact 
Analysis Technique 

3.1 Functionalities of Power-Editor 

We have developed an impact analyzing tool which runs at the back-end of an
editor. The tool is interfaced with the editor, and the interface allows the users to
use the impact analyzing tool along with the normal editing facilities of the editor.
The editor and the impact analyzing tool are packaged together and are called the
Power-Editor. The Power-Editor has the following functionalities:

1. All the interactions between the impact analyzing tool and the user are done
through the editor.

2. The user can invoke the impact analyzer from the editor. When invoked the 
tool asks for the seed variable, does impact analysis on the program in the 
current buffer of the editor and produces the impact chains for the specified 
seed variable. 

3. The user can browse through the impact chain with simple key-strokes and 
the editor takes the cursor to the corresponding positions in the program. The 
user can edit the program simultaneously. 

3.2 Implementation Strategy

The impact analyzing tool has two main modules, the Impact Analysis Engine (IAE)
and the Impact Chain Processor (ICP). The IAE has two submodules, namely the
Intermediate Representation Generator (IRG) and the Impact Analyzer.

Figure 6 shows the modules and their interfaces. 



Figure 6: Modular design of the tool. 


3.3 Discussion on the Design Modules 

3.3.1 Intermediate Representation Generator (IRG) 

This module produces an equivalent intermediate representation of the input program
in 3-address intermediate code. The IRG uses the front end of a compiler, i.e. the
lexical analyzer and the parser. The intermediate code is stored in a specified format
in a temporary file and is used in the subsequent phases of
the tool. The IRG expects a correctly written program as input and produces the
corresponding 3-address intermediate code as output.
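
As an illustration of why 3-address code suits this analysis, the sketch below shows one possible in-memory form of an instruction. The layout is an assumption made for this example and is not the file format used by the IRG; the names Quad, example_code, quad_uses and quad_format are hypothetical. Every instruction defines at most one variable (result) and uses at most two (arg1, arg2), so definitions and uses can be read off directly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char        op;      /* '+', '-', '*', '/' or '=' for a plain copy */
    const char *arg1;    /* first operand                              */
    const char *arg2;    /* second operand, NULL for a copy            */
    const char *result;  /* the variable defined by this instruction   */
} Quad;

/* The statement  a = b * c + d  lowered with a temporary t1. */
const Quad example_code[2] = {
    { '*', "b",  "c", "t1" },
    { '+', "t1", "d", "a"  }
};

/* Returns non-zero if the instruction uses the variable `var`. */
int quad_uses(const Quad *q, const char *var)
{
    assert(q != NULL && q->arg1 != NULL && var != NULL);
    return strcmp(q->arg1, var) == 0 ||
           (q->arg2 != NULL && strcmp(q->arg2, var) == 0);
}

/* Writes a readable form of the instruction into buf. */
int quad_format(const Quad *q, char *buf, size_t size)
{
    assert(q != NULL && buf != NULL);
    if (q->arg2 == NULL)
        return snprintf(buf, size, "%s = %s", q->result, q->arg1);
    return snprintf(buf, size, "%s = %s %c %s",
                    q->result, q->arg1, q->op, q->arg2);
}
```

In this form a value held in b reaches a only through the temporary t1, which is the kind of indirect flow that the impact chain is meant to follow.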

3.3.2 Impact Analyzer 

This is the core module and performs impact analysis on the input code, which is
in 3-address intermediate representation. The input to this module is produced by
the IRG. The impact analyzer takes a seed variable from the user and produces impact
chains for all the seed definitions. The impact of the seed variable can propagate
from the main program unit to other functions, possibly in other files (e.g.
included files), through function calls. The impact analyzer is able to follow such
data flows.

The impact analyzer performs the following steps in order:

1. Generates the basic blocks from the input. 

2. Adds control flow information to the basic blocks and produces the control-flow 
graph. 

3. Takes a seed variable and finds out all the seed definitions. Then runs impact 
analysis algorithm and produces the impact chains corresponding to the seed 
definitions. 
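
Step 1 can be done with the usual ‘leader’ rule from compiler construction: the first instruction, every jump target, and every instruction that follows a jump each start a new basic block. The sketch below is a minimal version under assumptions: the instruction encoding (Instr, Kind) is invented for the example and is not the representation used by the tool.

```c
#include <assert.h>

#define MAX_INSTRS 64

typedef enum { PLAIN, GOTO, IF_GOTO } Kind;

typedef struct {
    Kind kind;
    int  target;    /* index of the jump target; ignored for PLAIN */
} Instr;

/* Fills block_of[i] with the number of the basic block containing
   instruction i and returns the number of basic blocks. */
int find_basic_blocks(const Instr code[], int n, int block_of[])
{
    int leader[MAX_INSTRS] = {0};
    int nblocks = 0;

    assert(n > 0 && n <= MAX_INSTRS);
    leader[0] = 1;                           /* first instruction */
    for (int i = 0; i < n; i++) {
        if (code[i].kind == PLAIN)
            continue;
        assert(code[i].target >= 0 && code[i].target < n);
        leader[code[i].target] = 1;          /* target of a jump  */
        if (i + 1 < n)
            leader[i + 1] = 1;               /* follows a jump    */
    }
    for (int i = 0; i < n; i++) {
        if (leader[i])
            nblocks++;
        block_of[i] = nblocks - 1;
    }
    return nblocks;
}
```

The control-flow edges of step 2 then follow from the last instruction of each block: a GOTO gives one edge to the block of its target, an IF_GOTO gives that edge plus one to the next block, and a PLAIN instruction falls through to the next block.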

■ Implementation

The implementation deviates slightly from the impact analysis algorithm. Suppose
some impacted variable is used as an argument in a ‘call’ statement. Let d_a
be the definition of the dummy parameter of the called function to which the argument
value is assigned. Obviously d_a is impacted through the ‘call’ statement and we
can obtain an impact chain for d_a. The impact analysis algorithm implies that the
impact chain of d_a is a part of the main impact chain, but the implementation stores
it as a separate impact chain. Although this deviation increases the number of impact
chains, it reduces computation, and the total number of distinct impact points
remains the same in both cases. The implementation restricts each impact chain
to one function unit.

Input  : The control-flow graph of the program. The seed variable ‘x’.
Output : ‘Impact Chain’.

Method :

Define the following sets :

Declarations :

    Ds      = Set of definitions which are the seed definitions
              for impact analysis.
    Instr_s = The ‘Seed Definition’ which is currently under
              scrutiny.
    SeedVar = Variable defined in Instr_s.
    M       = Set of statements which are already included
              in the impact chain corresponding to the current
              seed definition.
    S       = Set of statements which are impacted by Instr_s.
    I_call  = Set of definitions which are impacted through a
              ‘call’ to a function by the current seed variable.
    IQ_s    = Queue of seed definitions.
    IQ_i    = Queue of statements impacted.

Algorithm :

    Ds = { I | I is a statement and y is defined in I and
               y syntactically matches the seed variable x };
    For all elements e in Ds, enqueue e in IQ_s;
    while (IQ_s not empty) do
    {
        Instr_s = dequeue from IQ_s;
        SeedVar = variable defined in Instr_s;
        S = { I | I is a statement and Instr_s reaches
                  I and I is a use of SeedVar };
        I_call = { I | I is the first statement of the header basic
                       block of function ‘func’ and the definition Instr_s
                       is used to ‘call’ function ‘func’ };
        M = M ∪ I_call;
        I_call = I_call − Ds;
        for each statement e in I_call, enqueue e in IQ_s;
        Ds = Ds ∪ I_call;
        M = M ∪ Instr_s;
        M = M ∪ S;
        for all elements e in S, enqueue e in IQ_i;
        while (IQ_i not empty) do
        {
            I_i = dequeue from IQ_i;
            SeedVar = variable defined in I_i;
            S' = { I | I is a statement and I_i reaches I
                       and y is defined in I and
                       SeedVar is used in I };
            N = S' − M;
            Enqueue each element of N in IQ_i;
            M = M ∪ S';
            I_call = { I | Instr_s is used to ‘call’ a function f
                           and Instr_s is the formal argument for some
                           parameter x of f and x is declared at I };
            M = M ∪ I_call;
            I_call = I_call − Ds;
            for each statement e in I_call, enqueue e in IQ_s;
            Ds = Ds ∪ I_call;
        }
    }

3.3.3 Impact Chain Processor (ICP)

The ICP provides an interface with an editor. The ICP maintains the impact chains
in its internal list data structure and provides an abstract view to the user. Each
impact point in the impact chain is a position in some program file, and the ICP
takes the cursor to that point in the file when the user wants to see that point. So,
while browsing through the impact chains the user only sees the cursor taken to the
points in the program where the seed variable has its impact. The ICP automatically loads
a file if the impact point refers to a position in some other file. With simple
keystrokes the user can browse through the impact chains in both forward and backward
directions and can edit the programs simultaneously.
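
The browsing behaviour described above amounts to moving a cursor over a list of positions. The sketch below shows the idea in C for consistency with the other examples; the ICP itself is written in Emacs Lisp, and the names ImpactPoint, ChainCursor and cursor_move are hypothetical. Stopping at the ends of the chain instead of wrapping around is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *file;   /* file containing the impact point */
    int         row;
    int         col;
} ImpactPoint;

typedef struct {
    const ImpactPoint *points;
    int                count;
    int                current;   /* index of the point under the cursor */
} ChainCursor;

/* Moves the cursor `delta` impact points forward (positive) or
   backward (negative), stopping at either end of the chain, and
   returns the impact point that is now current. */
const ImpactPoint *cursor_move(ChainCursor *c, int delta)
{
    int target;

    assert(c != NULL && c->count > 0);
    target = c->current + delta;
    if (target < 0)
        target = 0;
    if (target >= c->count)
        target = c->count - 1;
    c->current = target;
    return &c->points[target];
}
```

A front end would compare the file field of the returned point with the file currently displayed and load the other file when they differ, before placing the cursor at (row, col).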

Chapter 4

Results and Discussion 


4.1 Development Environment 

The Power-Editor was developed under the UNIX environment. The assumptions
made for the implementation are as follows:

1. The input programming language is ANSI C. The grammar of ANSI C has been
taken from Compiler Design in C [12] and has been modified.

2. The query language is SQL. The grammar, written by Leroy Cain, is taken from
a web archive [5].

3. The impact analyzing tool analyses the program at the code level. It assumes
that the programs are written in ANSI C with embedded ESQL statements. Therefore, the tool
works for databases where programming in this combination is permitted.
Oracle is an example of this kind.

Tools used for implementation are:

1. Lex for lexical analysis [16].

2. YACC for parsing [13].

3. The preprocessor of the ANSI C compiler.

4. The interfacing is made with the GNU Emacs editor. The ICP is written in eLisp
(Emacs Lisp) [17, 20] to talk to the editor.

4.1.1 Code Size

• Code length:

    Tools                                    2683 loc    60858 bytes
    Intermediate Representation Generator    4163 loc   108665 bytes
    Impact Chain Generator                   1952 loc    76304 bytes
    Interface with Editor                     527 loc    11493 bytes
    Browser (eLisp code)                      366 loc     8811 bytes

• Tool Size (binaries):

    Intermediate Representation Generator    352256 bytes
    Impact Analyzer+Interface                221184 bytes
    Browser (eLisp byte-code)                  6467 bytes


4.2 Test Results 


The tool has been tested on a few moderately large ANSI C programs of considerable
complexity. The test results are given in Table 1. The tests were run
on a DEC Alpha machine.


4.2.1 Capabilities of the Impact Analyzing Tool 

• The tool identifies variables impacted by the seed variable. The seed variable
can be any variable name, including a date-variable. In this sense it qualifies
as a tool for the Year2000 problem. Otherwise it is a generalized tool.

• The tool assumes the input program to be correct. 

• At the current stage, impact point identification through aliasing and pointers 
is not implemented. 

• The IRG module of the tool requires further debugging for programs with 
embedded SQL statements. 

Figure 7 through Figure 13 give snap-shots of the tool at work. 

Appendix B shows some impact chains for different seed variables. 

    Program size   No. of files   Chain    CPU time   Real time
    (loc)          included       length   (sec.)     (sec.)
    -----------------------------------------------------------
    2294           15             16       0.52       1.16
    2294           15             37       0.65       1.21
    2294           15             10       0.46       1.01
    2294           15             18       0.45       1.18
    2294           15             28       0.50       1.15
    2618           17              5       0.25       0.88
    2618           17              2       0.24       0.78
    3166           18             25       0.39       1.02
    3398           15              3       0.51       1.11
    3398           15              7       0.49       1.10
    3398           15             26       0.53       1.05
    3398           15              9       0.47       1.09

Table 1: Test Results


Figure 7: The main C file opened in gnu-emacs editor.

Figure 9: The first impact point.

Figure 12: Variable impacted through function call.

Figure 13: The next impact point in the function.



Chapter 5 
Conclusion 


In this thesis an analysis method has been developed for in-depth analysis of programs
at the code level. This technique is called the Impact Analysis method. The
analysis procedure is based on data-flow analysis, a method commonly employed
in the code optimization phase of compilers. In the control-flow graph of a program,
the technique traces the complete path of data values originating from a
specified variable, called the seed variable. Along its data-flow path the data may be
associated with variables other than the seed variable.

The impact analysis method has been employed to make a better tool
than the existing analysis-editing tools for the Year2000 problem. The existing tools
do address-based analysis. They search programs syntactically for variables which
match a pre-defined pattern and declare them ‘date-variables’; the other variables
are ‘non-date-variables’. These tools are therefore primarily pattern-matchers. The
main weakness of the ‘address-based’ analysis technique is that it fails to identify those
non-date-variables which may hold a date-value during program execution. In contrast
to ‘address-based’ analysis, impact analysis is a value-based analysis
procedure. Impact analysis traces a ‘value’, which can be a date-value, and reveals
the variables and the places in the program which are impacted by that value. Moreover,
the analysis method systematically reduces the search space, as it checks only
those places in the program which the flow of control can reach.
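
The difference between the two kinds of analysis can be seen in a small fragment. The function and variable names below are invented for this illustration and do not come from the test programs. A pattern-matcher looking for names such as ‘yy’ or ‘date’ would flag join_yy and current_yy but not tmp or base, although both hold a two-digit year; impact analysis with join_yy as the seed variable reaches them through the assignments.

```c
#include <assert.h>

/* join_yy and current_yy are two-digit years, as stored in a
   hypothetical legacy record. */
int years_of_service(int join_yy, int current_yy)
{
    int tmp  = join_yy;        /* non-date name, but holds a date value    */
    int base = tmp;            /* the value flows on by assignment         */

    assert(join_yy >= 0 && join_yy <= 99);
    assert(current_yy >= 0 && current_yy <= 99);
    return current_yy - base;  /* negative once the century rolls over     */
}
```

With join_yy = 95 the function returns 3 for the year ‘98’ but -95 for the year ‘00’, and the subtraction that has to be repaired is written entirely in terms of variables that a name-based search would skip.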

We have implemented a Power-Editor which, along with the normal editing facilities,
has the capability to carry out impact analysis on the program in the editor. Impact
analysis can be carried out on programs written in ANSI C with ESQL statements.


5.1 Future Work 

In the present implementation the tool does not automatically find the date-variables
in the program. The tool can be extended to include a pattern-matcher for the
automatic identification of date-variables. In its current state the tool cannot find
impact points through aliasing and pointers; it can be modified to take them into
account.

Appendix A


User’s Manual for Power-Editor


A.1 About the Power-Editor

The Power-Editor is based on the GNU Emacs editor. It provides the extra functionality of
doing impact analysis on the program in the current buffer of the editor. The user
can invoke the impact analyzing tool (which does the impact analysis) from the editor.
When invoked, the editor asks for a seed variable and brings out the impacts of the
given variable. The user can then browse through the impact points using simple
key-strokes and at the same time use the normal editing functionalities of the
editor.


A.2 How to use the Impact Analyzing Tool

Suppose the facility is available in the directory:

    /usr/lib/emacs/lisp

Then include the following lines in your .emacs file.

    (setq load-path
          (append
           (list nil
                 "/usr/lib/emacs/lisp")
           load-path))

    (require 'ped)
    (autoload 'ped "ped" nil t)

    (setq emacs-lisp-mode-hook
          (function (lambda ()
                      (require 'ped)
                      (ped))))

Now, to invoke the tool the user has to load the file in the Emacs editor and
then invoke the function Analysis (Meta-x Analysis). The tool will ask for a
seed variable. It then analyzes the program loaded in the current Emacs buffer and
constructs the impact chain. Once the analysis is over, the cursor shows the first
impact point. The user can then edit the program. At this point the user can use
the following functions to browse through the impact points; their
functionalities are given alongside. The user can also use the editing facilities
of the editor simultaneously.

    Next         Takes the cursor to the next impact point.
    Prev         Takes the cursor to the previous impact point.
    Next-Node    Asks for a number and takes the cursor
                 that many impact points forward.
    Prev-Node    Asks for a number and takes the cursor
                 that many impact points backward.

To facilitate the use of these functions, one can map them to some keys (as is
normally done with Emacs). For example, here is a set of key-map definitions that
binds the functions to the corresponding keys in the global key-map table.

    (define-key global-map "\C-ca" 'Analysis)
    (define-key global-map "\C-cn" 'Next)
    (define-key global-map "\C-cp" 'Prev)
    (define-key global-map "\C-cf" 'Next-Node)
    (define-key global-map "\C-cb" 'Prev-Node)

If you include them in your .emacs file, you get the following effects:

    Cntl+c a : Asks for a seed variable and
               constructs the impact chain.
    Cntl+c n : Takes the cursor to the next
               impact point.
    Cntl+c p : Takes the cursor to the previous
               impact point.
    Cntl+c f : Asks for a number and takes the cursor
               that many impact points forward.
    Cntl+c b : Asks for a number and takes the cursor
               that many impact points backward.

Note: The impact analysis expects only correctly written programs in ANSI-C 
with embedded ESQL statements. 

Appendix B


Test Data


B.1 Test Program

The impact analyzing tool has been tested on the following program. The program
has an included file, which is also listed below. The test results for two different
variables are given in the next section.


/**************************************************************/
/*                                                            */
/*   File : zed.c (main file)                                 */
/*                                                            */
#include <stdio.h>
#include "matrix.inc"
#define N 8
typedef int junk;
main()
{
  float I[N][N];
  float H1[N][N],H2[N][N],H[N][N];
  float y1[N/2],y2[N/2],y3[N/2],y4[N/2];
  float y[N];
  float temp1[N/2][N/2];
  float temp2[N/2][N];
  float Basis[2][2];
  junk i,j;

  for(i=0;i<N/2;i++)
    for(j=0;j<N;j++) temp2[i][j]=0.0;
  MakeDiaMatrix(&(Basis[0][0]),2,1.0);
  MatrixGrow(&(temp1[0][0]),N/2,&(Basis[0][0]));
  CopyMatrix(&(I[0][0]),N,N,&(temp1[0][0]),N/2,N/2);
  MatrixColAppend(&(I[0][0]),N/2,N/2,&(temp1[0][0]),N/2,N/2);
  CopyMatrix(&(temp2[0][0]),N/2,N,&(temp1[0][0]),N/2,N/2);
  MakeDiaMatrix(&(Basis[0][0]),2,(-1.0));
  MatrixGrow(&(temp1[0][0]),N/2,&(Basis[0][0]));
  MatrixColAppend(&(temp2[0][0]),N/2,N/2,&(temp1[0][0]),N/2,N/2);
  MatrixRowAppend(&(I[0][0]),N/2,N,&(temp2[0][0]),N/2,N);
  MatrixPrint(&(I[0][0]),N,N);


  Basis[0][0]=1.0;
  Basis[0][1]=1.0;
  Basis[1][0]=1.0;
  Basis[1][1]=(-1.0);
  MatrixGrow(&(temp1[0][0]),N/2,&(Basis[0][0]));
  CopyMatrix(&(H1[0][0]),N,N,&(temp1[0][0]),N/2,N/2);
  MakeDiaMatrix(&(temp1[0][0]),N/2,(0.0));
  MatrixColAppend(&(H1[0][0]),N/2,N/2,&(temp1[0][0]),N/2,N/2);
  CopyMatrix(&(temp2[0][0]),N/2,N,&(temp1[0][0]),N/2,N/2);
  MatrixGrow(&(temp1[0][0]),N/2,&(Basis[0][0]));
  MatrixColAppend(&(temp2[0][0]),N/2,N/2,&(temp1[0][0]),N/2,N/2);
  MatrixRowAppend(&(H1[0][0]),N/2,N,&(temp2[0][0]),N/2,N);
  MatrixPrint(&(H1[0][0]),N,N);

  MatrixMult(&(I[0][0]),N,N,&(H1[0][0]),N,N,&(H[0][0]));
  CopyMatrix(&(H2[0][0]),N,N,&(H[0][0]),N,N);

  MatrixMult(&(H1[0][0]),N,N,&(H2[0][0]),N,N,&(H[0][0]));
  CopyMatrix(&(H2[0][0]),N,N,&(H[0][0]),N,N);

  MatrixPrint(&(H2[0][0]),N,N);


  printf("Enter the Y1 Matrix \n");
  MatrixRead(&(y1[0]),N/2,1);
  printf("Enter the Y2 Matrix \n");
  MatrixRead(&(y2[0]),N/2,1);
  printf("Enter the Y3 Matrix \n");
  MatrixRead(&(y3[0]),N/2,1);
  printf("Enter the Y4 Matrix \n");
  MatrixRead(&(y4[0]),N/2,1);

  MatrixSub(&(y1[0]),N/2,1,&(y2[0]),N/2,1);
  MatrixSub(&(y3[0]),N/2,1,&(y4[0]),N/2,1);

  CopyMatrix(&(y[0]),N,1,&(y1[0]),N/2,1);
  MatrixRowAppend(&(y[0]),N/2,1,&(y3[0]),N/2,1);

  MatrixMult(&(H2[0][0]),N,N,&(y[0]),N,1,&(H[0][0]));
  MatrixPrint(&(H[0][0]),N,1);

  for(i=0;i<N;i++)
    H[0][i]=H[0][i]/N;
  MatrixPrint(&(H[0][0]),N,1);
}

/**************************************************************/
/*                                                            */
/*   File : matrix.inc (included file)                        */
/*                                                            */
#ifndef __MATRIX_INC__
#define __MATRIX_INC__
MatrixRead(a,row,col)
const float *a;
int row;
int col;
{
    int i,j;
    float val;
    printf("\t");
    for(j=0;j<col;j++) printf(" %d\t",j+1);
    printf("\n");
    for(i=0;i<row;i++)
    {
      printf("Row %d\t",i+1);
      for(j=0;j<col;j++)
      {
        scanf("%f",&val);
        *(a+(i*col)+j)=val;
      }
      printf("\n");
    }
}

MatrixPrint(a,row,col)
float *a;
int row;
int col;
{
    int i,j;

    printf("\t");
    for(j=0;j<col;j++) printf(" %d\t",j+1);
    printf("\n");
    for(i=0;i<row;i++)
    {
      printf(" %d\t",i+1);
      for(j=0;j<col;j++) printf(" %3.2f\t",*(a+(i*col)+j));
      printf("\n");
    }
}

MakeDiaMatrix(a,size,val)
float *a;
int size;
float val;
{
    int i,j;

    for(i=0;i<size;i++)
      for(j=0;j<size;j++)
        *(a+(i*size)+j)=0;
    for(i=0;i<size;i++) *(a+(i*size)+i)=val;
}

MatrixMult(a,rowa,cola,b,rowb,colb,c)
float *a;
int rowa;
int cola;
float *b;
int rowb;
int colb;
float *c;
{   int i,j,k;

    if (cola==rowb)
    {
      for(i=0;i<rowa;i++)
        for(j=0;j<colb;j++)
          *(c+(cola*i)+j)=0;
      for(i=0;i<rowa;i++)
        for(j=0;j<cola;j++)
          for(k=0;k<colb;k++)
            *(c+(i*colb)+k)+=((*(a+(i*cola)+j))*(*(b+(j*colb)+k)));
    } else
      fprintf(stderr,"(MatMult) : Matrices are incompatible for \
Matrix Multiplication\n");
}

MatrixSub(a,rowa,cola,b,rowb,colb)
float *a;
int rowa;
int cola;
float *b;
int rowb;
int colb;
{
    int i,j;

    if ((rowa==rowb)&&(cola==colb))
    {
      for(i=0;i<rowa;i++)
        for(j=0;j<colb;j++)
          *(a+(i*cola)+j)-=(*(b+(colb+i)+j));
    } else
      fprintf(stderr,"(MatSub) : Matrices are incompatible for\
Matrix subtraction\n");
}

MatrixGrow(a,size,Basis)
float *a;
int size;
float *Basis;
{
    int row,col;
    int i,j;

    *(a)=(*(Basis));
    *(a+1)=(*(Basis+1));
    *(a+size)=(*(Basis+2));
    *(a+size+1)=(*(Basis+3));
    row=col=2;

    while(row<size)
    {
      for(i=0;i<row;i++)
        for(j=0;j<col;j++)
        {
      *(a+(i*size)+(j+col))       = (*(a+(i*size)+j)) * (*(Basis+1));
      *(a+((i+row)*size)+j)       = (*(a+(i*size)+j)) * (*(Basis+2));
      *(a+((i+row)*size)+(j+col)) = (*(a+(i*size)+j)) * (*(Basis+3));
        }
      row*=2; col*=2;
    }
}

MatrixRowAppend(a,rowa,cola,b,rowb,colb)
float *a;
int rowa;
int cola;
float *b;
int rowb;
int colb;
{
    int i,j;

    if (cola==colb)
    {
      for(i=0;i<rowb;i++)
        for(j=0;j<colb;j++)
          *(a+((i+rowa)*cola)+j)=(*(b+(i*colb)+j));
    } else
      fprintf(stderr,"(MatrixAppend) : Matirces are incompatible\
for Row Append\n");
}

MatrixColAppend(a,rowa,cola,b,rowb,colb)
float *a;
int rowa;
int cola;
float *b;
int rowb;
int colb;
{
    int i,j;

    if (rowa==rowb)
    {
      for(i=0;i<rowb;i++)
        for(j=0;j<colb;j++)
          *(a+i*(cola+colb)+(j+cola))=(*(b+(i*colb)+j));
    } else
      fprintf(stderr,"(MatrixAppend) : Matirces are incompatible\
for Col Append\n");
}

CopyMatrix(a,rowa,cola,b,rowb,colb)
float *a;
int rowa;
int cola;
float *b;
int rowb;
int colb;
{
    int i,j;

    if ((rowa>=rowb)&&(cola>=colb))
    {
      for(i=0;i<rowb;i++)
        for(j=0;j<colb;j++)
          *(a+(i*cola)+j)=(*(b+(i*colb)+j));
    } else
      fprintf(stderr,"(CopyMatrix) : Matrices Incompatible for Copying
}

#endif


B.2 Impact Chains 

An impact chain is a series of impact points. Each impact point is represented
in the following format:

< "File Name" , Variable Name ( Row No. , Column No. ) >

Row and column numbers are counted from zero. The seed definitions are labeled
‘Seed :’ along the left margin, and each is followed by its uses.


Impact Analysis for
Seed Variable = I

Seed : <"zed.c",I(10,8)>
       <"zed.c",I(23,15)>
       <"zed.c",I(24,20)>
       <"zed.c",I(29,20)>
       <"zed.c",I(30,16)>
       <"zed.c",I(47,15)>
Seed : <"./matrix.inc",a(170,11)>
       <"./matrix.inc",a(184,3)>
Seed : <"./matrix.inc",a(150,16)>
       <"./matrix.inc",a(164,8)>
Seed : <"./matrix.inc",a(130,16)>
       <"./matrix.inc",a(144,7)>
Seed : <"./matrix.inc",a(28,12)>
       <"./matrix.inc",a(41,45)>
Seed : <"./matrix.inc",a(59,11)>
       <"./matrix.inc",a(77,24)>


Impact Analysis for
Seed Variable = Basis

Seed : <"./matrix.inc",Basis(103,18)>
       <"./matrix.inc",Basis(122,59)>
       <"./matrix.inc",Basis(123,59)>
       <"./matrix.inc",Basis(124,59)>
Seed : <"./matrix.inc",Basis(106,7)>
       <"./matrix.inc",Basis(111,12)>
       <"./matrix.inc",Basis(112,14)>
       <"./matrix.inc",Basis(113,17)>
       <"./matrix.inc",Basis(114,19)>
       <"./matrix.inc",Basis(122,59)>
       <"./matrix.inc",Basis(123,59)>
       <"./matrix.inc",Basis(124,59)>
Seed : <"zed.c",Basis(16,8)>
       <"zed.c",Basis(21,18)>
       <"zed.c",Basis(22,34)>
       <"zed.c",Basis(26,18)>
       <"zed.c",Basis(27,34)>
       <"zed.c",Basis(42,34)>
Seed : <"zed.c",Basis(33,2)>
       <"zed.c",Basis(42,34)>
Seed : <"zed.c",Basis(34,2)>
       <"zed.c",Basis(42,34)>
Seed : <"zed.c",Basis(35,2)>
       <"zed.c",Basis(42,34)>
Seed : <"zed.c",Basis(36,2)>
       <"zed.c",Basis(37,34)>
       <"zed.c",Basis(42,34)>
Seed : <"./matrix.inc",a(46,14)>
       <"./matrix.inc",a(55,3)>
       <"./matrix.inc",a(56,26)>
Seed : <"./matrix.inc",a(103,11)>
       <"./matrix.inc",a(122,8)>
       <"./matrix.inc",a(122,39)>
       <"./matrix.inc",a(123,8)>
       <"./matrix.inc",a(123,39)>
       <"./matrix.inc",a(124,8)>
       <"./matrix.inc",a(124,39)>